Scaling Distributed Machine Learning leveraging vSphere, Bitfusion and NVIDIA GPU (Part 1 of 2) - Virtualize Applications
Organizations are quickly embracing Artificial Intelligence (AI), Machine Learning and Deep Learning to open new opportunities and accelerate business growth. AI workloads, however, require massive compute power, which has led to the proliferation of GPU acceleration in addition to traditional CPU power. This shift has broken the traditional data center architecture and amplified organizational silos, poor utilization and lack of agility. While virtualization technologies have proven themselves in the enterprise with cost-effective, scalable and reliable IT computing, machine learning infrastructure has not evolved in step and is still bound to dedicated physical resources to optimize and reduce training times. Bitfusion helps enterprises disaggregate GPU compute and dynamically attach GPUs anywhere in the datacenter, just as they would attach storage.
- Information Technology > Hardware (0.48)
- Information Technology > Software (0.44)
Distributed Machine Learning on VMware vSphere with GPUs and Kubernetes: a Webinar - Virtualize Applications
This article directs you to a recent webinar that VMware produced on the topic of executing distributed machine learning with TensorFlow and Horovod running on a set of VMs on multiple vSphere host servers. Many machine learning problems are tackled today using a single host server (with a collection of VMs on that host). However, when your ML model or data grows too large for one host to handle, or your GPU power is dispersed across several physical host servers/VMs, distribution is the mechanism used to tackle that scenario. The VMware webinar first introduces the concepts of machine learning in general. It then gives a short description of Horovod for distributed training and explains the importance of low-latency networking between the nodes in the distributed model, based here on Mellanox RDMA over Converged Ethernet (RoCE) technology.
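The core idea behind Horovod-style distributed training is synchronous data parallelism: each worker computes a gradient on its own shard of the data, the gradients are averaged across workers (Horovod implements this with a ring-allreduce over the network, which is why low-latency interconnects like RoCE matter), and every replica applies the same averaged update. The sketch below illustrates that pattern in plain Python with a toy one-parameter model; it does not use Horovod or TensorFlow themselves, and the function names are illustrative, not part of any API.

```python
def local_gradient(w, shard):
    """Gradient of mean squared error for the toy model y = w * x on one worker's shard."""
    return sum(2 * (w * x - y) * x for x, y in shard) / len(shard)

def allreduce_average(values):
    """Stand-in for Horovod's ring-allreduce: every worker ends up with the mean."""
    return sum(values) / len(values)

def distributed_step(w, shards, lr):
    """One synchronous SGD step: local gradients, allreduce, identical update everywhere."""
    grads = [local_gradient(w, s) for s in shards]  # computed in parallel on real workers
    return w - lr * allreduce_average(grads)

# Data generated from y = 3x, split into shards for two "workers".
shards = [[(1.0, 3.0), (2.0, 6.0)], [(3.0, 9.0), (4.0, 12.0)]]
w = 0.0
for _ in range(200):
    w = distributed_step(w, shards, lr=0.02)
print(round(w, 3))  # converges toward the true slope, 3.0
```

Because every replica applies the same averaged gradient, the model weights stay identical on all workers after each step; in a real Horovod job the allreduce is the only point of network communication per step, which is what makes the interconnect latency so important.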
Distributed Machine Learning on vSphere leveraging NVIDIA vGPU and Mellanox PVRDMA - Virtualize Applications
While virtualization technologies have proven themselves in the enterprise with cost-effective, scalable and reliable IT computing, High Performance Computing (HPC) has not evolved in step and is still bound to dedicated physical resources to obtain predictable runtimes and maximum performance. VMware has developed technologies to effectively share accelerators for compute and networking. It is also possible to provision multiple GPUs to a single VM, enabling maximum GPU acceleration and utilization. With the impending end of Moore's law, the spark fueling the current revolution in deep learning is having enough compute horsepower to train neural-network-based models in a reasonable amount of time. That horsepower is derived largely from GPUs, which NVIDIA has been optimizing for deep learning since 2012. The latest GPU architecture from NVIDIA is Turing, available in the T4 as well as the RTX 6000 and RTX 8000 GPUs, all of which support virtualization.
- Information Technology > Hardware (0.98)
- Information Technology > Software (0.65)
GPUs for Machine Learning on VMware vSphere: Decision-maker's Guide - Virtualize Applications
Are you being asked to provide GPUs to your application developers and data scientists for machine learning or high performance computing? Are users asking for more than one GPU to be usable by their application? Are you interested in cost-effective ways to share GPUs across the entire data science team? If any of these questions applies to you, then this new E-Book from VMware on the key decisions about GPU use on vSphere will be a great read for you. GPUs provide the computing power needed to run machine learning programs efficiently, reliably and quickly.
- Information Technology > Virtualization (1.00)
- Information Technology > Artificial Intelligence > Machine Learning (1.00)
- Information Technology > Data Science (0.97)
Machine Learning with H2O - the Benefits of VMware - Virtualize Applications
This brief article introduces a short 4.5-minute video that explains why VMware vSphere is a great platform for data scientists and engineers to use as their base operating platform. The video then demonstrates an example of this, showing a data scientist conducting a modeling experiment on an input set of data, using the Driverless AI tool from H2O.ai for the data analysis and model training, all in VMs. The key idea here is that the world of machine learning and data science is rapidly changing, with new, powerful tools, platforms and versions appearing and upgrading at a very fast pace. Tool vendors are racing to innovate and are producing new workbenches for both the expert and the novice in the field. Data scientists and data engineers (who organize and cleanse the data first) want to be able to try out these new tools and updated versions while keeping a stable environment for their existing production deployments.
- Information Technology > Software (0.65)
- Banking & Finance (0.53)
Machine Learning using Virtualized GPUs on VMware vSphere - Virtualize Applications
Machine learning (ML) has recently undergone huge progress in research and development. Importantly, the emergence of deep learning (DL) and the growing computing power of accelerators like GPUs have together enabled tremendous adoption of machine learning applications. This has given machine learning a broader and deeper impact on our lives in many areas, such as health science, finance, security, data center monitoring and intelligent systems. Hence, machine learning and deep learning workloads are also growing in datacenters and cloud environments. To support customers in deploying ML/DL workloads on VMware vSphere, we conducted a series of performance studies on ML-based workloads using GPUs.